Eagle job-aware scheduling: divide and ... reorder
Abstract
We present Eagle, a new hybrid cluster scheduler for data-parallel programs, consisting of a centralized scheduler for long jobs and a set of distributed schedulers for short jobs. Eagle incorporates two new techniques: succinct state sharing and sticky batch probing. With succinct state sharing, the centralized scheduler informs the distributed schedulers of the placement of long jobs in a low-overhead way. The distributed schedulers then avoid worker nodes with long jobs to minimize head-of-line blocking. Combined with a small, dedicated partition for short jobs, succinct state sharing entirely eliminates head-of-line blocking of short jobs by long jobs. With sticky batch probing, the distributed schedulers queue probes for their tasks at various worker nodes, but when a worker node finishes a task, rather than executing the next task in its queue, it requests a new task from a distributed scheduler according to the desired scheduling discipline. We use sticky batch probing to implement a distributed approximation of SRPT (Shortest Remaining Processing Time) with starvation prevention. We have implemented Eagle as a Spark plugin, and we have measured job completion times for a subset of the Google trace on a 100-node cluster for a variety of cluster loads. We show that Eagle improves at all percentiles over Hawk, an earlier hybrid scheduler with which it shares a code base. We provide simulation results for larger clusters, different traces, and for comparison with other scheduling policies. Using traces from Cloudera, Google and Yahoo, we show that Eagle outperforms other scheduling disciplines at most percentiles, and is more robust against mis-estimation of task duration.
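The two techniques described in the abstract can be illustrated with a minimal Python sketch. It is not Eagle's implementation: the function names, the boolean list standing in for the shared state, the job dictionaries, and the `max_bypass` starvation rule are all assumptions made for illustration. The first function shows a distributed scheduler using a compact per-node summary to avoid nodes that host long jobs; the second shows a freed worker asking for the next task under an SRPT-like rule instead of running the head of its own queue.

```python
import random


def choose_probe_targets(long_job_bitmap, num_probes, rng=None):
    """Pick worker nodes to probe for a short job.

    long_job_bitmap[i] is True when node i currently hosts a long-job task.
    This list is a hypothetical stand-in for the compact summary that the
    centralized scheduler shares; nodes in a dedicated short-job partition
    would always appear as False here.
    """
    rng = rng or random.Random()
    # Only nodes without long jobs are candidates, which avoids
    # head-of-line blocking behind a long task.
    candidates = [n for n, busy in enumerate(long_job_bitmap) if not busy]
    return rng.sample(candidates, min(num_probes, len(candidates)))


def next_task_srpt(jobs, max_bypass=3):
    """Choose the job whose task a freed worker should run next.

    jobs: list of dicts with 'id', 'remaining' (estimated remaining work,
    assumed to be updated elsewhere), 'pending' (unscheduled tasks) and
    'bypassed' (how many times the job has been skipped).
    Returns the chosen job id, or None when nothing is pending.
    """
    runnable = [j for j in jobs if j["pending"] > 0]
    if not runnable:
        return None
    # Starvation prevention (assumed form): a job that has been skipped
    # max_bypass times takes priority over the plain SRPT order.
    starved = [j for j in runnable if j["bypassed"] >= max_bypass]
    chosen = min(starved or runnable, key=lambda j: j["remaining"])
    for j in runnable:
        if j is not chosen:
            j["bypassed"] += 1
    chosen["pending"] -= 1
    chosen["bypassed"] = 0
    return chosen["id"]
```

In this sketch, a job with a small `remaining` estimate is served first, and `max_bypass` bounds how often a longer job can be passed over before it gets a turn.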
Similar resources
Eagle: A Better Hybrid Data Center Scheduler
Eagle is a new hybrid data center scheduler that considerably improves the job completion times for short jobs. Eagle builds on the Hawk hybrid scheduler, using a centralized scheduler for long jobs and distributed schedulers for short jobs. The main innovation in Eagle is that it provides an approximate and potentially slightly out-of-date summary of the centralized scheduler state to the distr...
Efficient Scheduling Strategy Using Communication Aware Scheduling for Parallel Jobs in Clusters
In computer science, parallel job scheduling is an important field of research. Finding the most suitable processor in a high-performance or cluster computing system for user-submitted jobs plays an important role in measuring system performance. A new scheduling technique called communication aware scheduling is devised and is capable of handling serial jobs, parallel jobs, mixed jobs and...
Guides to Inventory Policy: Functions and Lot Sizes
But this is only one of the characteristic problems business managers face in dealing with production planning, scheduling, keeping inventories in hand, and expediting. Other questions, just as perplexing and baffling when managers approach them on the basis of intuition and pencil work alone, are: How often should we reorder, or how should we adjust production, when sales are uncertain? What cap...
Security Aware Parallel and Independent Job Scheduling in Grid Computing Environments Based on Adaptive Job Replication
In a grid environment, jobs may be scheduled to multiple machines across different administrative domains. However, grid security is a main hurdle to making job scheduling decisions secure, reliable and fault tolerant. A security-aware parallel and independent job scheduling algorithm for grid computing environments, based on adaptive job replication, was proposed. In risky and failure-prone grids,...
Energy Efficiency of Thermal-Aware Job Scheduling Algorithms under Various Cooling Models
One proposed technique to reduce energy consumption of data centers is thermal-aware job scheduling, i.e. job scheduling that relies on predictive thermal models to select among possible job schedules to minimize its energy needs. This paper investigates, using a more realistic linear cooling model, the energy savings of previously proposed thermal-aware job scheduling algorithms, which assume ...